Re-Engineering Letter-to-Sound Rules

Author

  • Martin Jansche
Abstract

Using finite-state automata for the text analysis component in a text-to-speech system is problematic in several respects: the rewrite rules from which the automata are compiled are difficult to write and maintain, and the resulting automata can become very large and therefore inefficient. Converting the knowledge represented explicitly in rewrite rules into a more efficient format is difficult. We take an indirect route, learning an efficient decision tree representation from data and tapping information contained in existing rewrite rules, which increases performance compared to learning exclusively from a pronunciation lexicon.
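As a rough illustration of the learning side of this idea, the sketch below trains a small decision tree that maps each letter, together with its neighbouring letters, to a phoneme. It is a minimal sketch under stated assumptions, not the paper's method: the toy lexicon, the one-phoneme-per-letter alignment, the window size `k`, and all function names (`windows`, `build_tree`, `predict`, `transcribe`) are hypothetical. It covers only learning from aligned lexicon entries; how the paper draws on existing rewrite rules is not described in the abstract and is not modeled here.

```python
from collections import Counter
import math

PAD = "#"  # hypothetical word-boundary symbol


def windows(letters, phones, k=1):
    """Yield (context, phone) pairs: each letter with k letters on either side."""
    padded = PAD * k + letters + PAD * k
    for i, phone in enumerate(phones):
        yield tuple(padded[i:i + 2 * k + 1]), phone


def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def build_tree(examples):
    """Greedy tree: split on the window position with the lowest weighted entropy."""
    labels = [y for _, y in examples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority  # pure leaf
    best = None
    for pos in range(len(examples[0][0])):
        groups = {}
        for x, y in examples:
            groups.setdefault(x[pos], []).append((x, y))
        if len(groups) < 2:
            continue  # this position does not separate the examples
        cost = sum(len(g) / len(examples) * entropy([y for _, y in g])
                   for g in groups.values())
        if best is None or cost < best[0]:
            best = (cost, pos, groups)
    if best is None:
        return majority  # identical contexts with conflicting labels
    _, pos, groups = best
    return (pos, {v: build_tree(g) for v, g in groups.items()}, majority)


def predict(tree, context):
    """Walk the tree; fall back to a node's majority label on unseen letters."""
    while isinstance(tree, tuple):
        pos, children, majority = tree
        if context[pos] not in children:
            return majority
        tree = children[context[pos]]
    return tree


def transcribe(tree, word, k=1):
    placeholder = [None] * len(word)
    return [predict(tree, ctx) for ctx, _ in windows(word, placeholder, k)]


# Toy aligned lexicon (made up for illustration): one phone symbol per letter.
LEXICON = [
    ("cat", ["k", "a", "t"]),
    ("city", ["s", "i", "t", "i"]),
    ("cot", ["k", "o", "t"]),
    ("cent", ["s", "e", "n", "t"]),
]

examples = [pair for word, phones in LEXICON for pair in windows(word, phones)]
tree = build_tree(examples)
print(transcribe(tree, "cit"))
```

With this toy data the tree first splits on the centre letter and then, for "c", on the following letter, which is the kind of context-dependent decision a hand-written rewrite rule would otherwise encode. A real system would need a proper letter-phoneme alignment and far more data.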


Similar resources

Welsh letter-to-sound rules: rewrite rules and two-level rules compared

In a text-to-speech synthesis system, input words not found in the system's lexicon are passed to letter-to-sound rules, which derive the word's pronunciation. In Welsh, the letter-to-sound rules must be applied in three passes: firstly, to add epenthetic vowels, secondly, to determine stress and vowel location, and thirdly, to perform grapheme-to-phoneme conversion. To begin with, all these le...


Morphological Decomposition and Stress Assignment for Speech Synthesis

Both approaches have their advantages and disadvantages; dictionary lookup fails for unknown words (e.g., proper nouns) and letter-to-sound rules fail for irregular words, which are all too common in English. Most speech synthesizers adopt a hybrid strategy, using the dictionary when possible and turning to letter-to-sound rules for the rest. I discussed letter-to-sound rules at the last meeti...


Morphology and rhyming: two powerful alternatives to letter-to-sound rules for speech synthesis

Most speech synthesizers have tended to depend on letter-to-sound rules for most words, and resort to a small "exceptions dictionary" of about 5000 words to cover the more serious gaps in the letter-to-sound rules. The Bell Laboratories Text-to-Speech system, TTS, takes a radical dictionary-based approach; dictionary methods (with morphological and analogical extensions) are used for the vast ma...


Performance Comparison between Human Engineered & Machine Learned Letter-to-sound Rules for English: a Machine Learning Success Story

The task of mapping spelled English words into strings of phonemes and stresses ("reading aloud") has many practical applications. Several commercial systems perform this task by applying a knowledge base of expert-supplied letter-to-sound rules. This paper presents a set of machine learning methods for automatically constructing letter-to-sound rules by analyzing a dictionary of words and thei...


Phonological Processing for Urdu Text to Speech System

Determining and modeling phonological phenomena is necessary to generate speech from textual input. These phenomena include letter to sound conversion, syllabification, sound change, stress assignment and intonation assignment. This paper presents work on Urdu phonological processes and provides algorithms to convert textual input into phonologically annotated output, required for Urdu text-to-...



Publication date: 2001